Abstract. Distinguishing logarithmic depth quantum circuits on mixed states is shown
to be complete for QIP, the class of problems having quantum interactive proof systems.
Circuits in this model can represent arbitrary quantum processes, and thus this result has
implications for the verification of implementations of quantum algorithms. The
distinguishability problem is also complete for QIP on constant depth circuits containing the
unbounded fan-out gate. These results are shown by reducing a QIP-complete problem to
a logarithmic depth version of itself using a parallelization technique.



1. Introduction 

Much of the difficulty in implementing quantum algorithms in practice is that qubits 
quickly decohere upon interacting with the environment. This entanglement-destroying
process limits the length of the computations that can be realized by experiment.
Implementing quantum algorithms as circuits of low depth can provide a way to perform as
much computation as possible within the limited time available, and for this reason there 
is significant interest in finding short quantum circuits for important problems. 

Log-depth quantum circuits have been found for several significant problems including 
the approximate quantum Fourier transform [3] and the encoding and decoding operations 
for many quantum error correcting codes [10]. In addition to these applications, a
procedure for parallelizing to log-depth a large class of quantum circuits has recently been
discovered [2]. These examples demonstrate the surprising power of short quantum circuits. 

Much of the work on quantum circuits is done in the standard model of unitary quantum 
circuits on pure states. In this paper a slightly different model of computation is considered: 
the model of mixed state quantum circuits, introduced by Aharonov, Kitaev, and Nisan [1].
While much of the previous complexity-theoretic work on short quantum circuits has been 
in the unitary model [4, 6], there has also been work outside of this model [13]. There
are several advantages to considering the more general model of mixed state circuits. The 
primary advantage is that the mixed state model is able to capture any process allowed by 
quantum mechanics, so that results on this model may have implications for experimental 
work in quantum computing. The problem of distinguishing circuits may thus be thought 
of as the problem of distinguishing potentially noisy physical processes. As an example, 
finding an error in an implementation of a quantum algorithm is simply the problem of
distinguishing the constructed circuit from one that is known to be correct. 

Unfortunately, in this paper it is shown that the apparent power of short quantum 
computations comes with a price: logarithmic depth quantum circuits are exactly as difficult 
to distinguish as polynomial depth quantum circuits. This equivalence implies the surprising 
result that distinguishing log-depth quantum computations is complete for the class QIP, 
the set of all problems that have quantum interactive proof systems. As
$\mathrm{PSPACE} \subseteq \mathrm{QIP} \subseteq \mathrm{EXP}$ [8], this result also implies that the problem is PSPACE-hard.

The result on circuit distinguishability is shown using the closely related problem of 
determining if two circuits can be made to output states that are close together. This 
problem was introduced by Kitaev and Watrous [8] who show it to be both complete for 
QIP and contained in EXP. The main result of the present paper is obtained by reducing an 
instance of this problem of polynomial depth to an equivalent instance of logarithmic depth. 
This demonstrates that the problem of close images remains complete for QIP even under 
a logarithmic depth restriction. The hardness of distinguishing short quantum circuits is
then demonstrated by modifying the argument in [12] to show that the equivalence
of the close images problem and the distinguishability problem holds even for log-depth circuits.

The remainder of this paper is organized as follows. In the next section, some of the 
notation and results that will be needed are summarized. This is followed by Section 3,
where the complete problems for QIP are discussed. In Section 4 the reduction from the
polynomial depth to logarithmic depth versions of the close images problem is given, and
the correctness of this construction is shown in Section 5. The equivalence between the
log-depth close images problem and the problem of distinguishing log-depth computations
is discussed in Section 6.

2. Preliminaries 

This section outlines some of the definitions and results that will be used throughout 
the paper. For a more thorough treatment of the concepts introduced here see [9] and [11].

Throughout the paper scripted letters such as $\mathcal{H}$ will refer to finite dimensional Hilbert
spaces, $\mathrm{D}(\mathcal{H})$ will denote the set of all density matrices on $\mathcal{H}$, and
$\mathrm{U}(\mathcal{H}, \mathcal{K})$ will denote the norm-preserving linear operators from $\mathcal{H}$ to $\mathcal{K}$.
The proof of the main result will make
extensive use of two notions of distance between quantum states. The first of these is the
fidelity. The fidelity between two positive semidefinite operators $X$ and $Y$ on a space $\mathcal{H}$
can be defined as

\[ F(X, Y) = \max \{ |\langle \phi | \psi \rangle| : |\phi\rangle, |\psi\rangle \in \mathcal{H} \otimes \mathcal{K},\ \operatorname{tr}_{\mathcal{K}} |\phi\rangle\langle\phi| = X,\ \operatorname{tr}_{\mathcal{K}} |\psi\rangle\langle\psi| = Y \}. \]

This definition is known as Uhlmann's Theorem, and it is used here as it is more directly
applicable to the task at hand than the usual definition. As any purification of a state
necessarily purifies the partial trace of that state, this equation implies that the fidelity is
nondecreasing under the partial trace. This property is known as monotonicity and can be
stated more formally as $F(X, Y) \le F(\operatorname{tr}_{\mathcal{K}} X, \operatorname{tr}_{\mathcal{K}} Y)$, where $X, Y$ are positive semidefinite
operators on $\mathcal{H} \otimes \mathcal{K}$. The final property of the fidelity that will be needed is the result that
the maximum fidelity of any outputs of two transformations is multiplicative with respect
to the tensor product. This result can be found in [9] (see Problem 11.10 and apply the
multiplicativity of the diamond norm with respect to the tensor product).



DISTINGUISHING SHORT QUANTUM COMPUTATIONS 



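Theorem 2.1 (Kitaev, Shen, and Vyalyi [9]). For any completely positive transformations
$\Phi_1, \Phi_2, \Psi_1, \Psi_2$ on states in $\mathcal{H}$,

\[ \max_{\rho, \xi \in \mathrm{D}(\mathcal{H} \otimes \mathcal{H})} F\big((\Phi_1 \otimes \Phi_2)(\rho), (\Psi_1 \otimes \Psi_2)(\xi)\big) = \prod_{i = 1, 2} \max_{\rho, \xi \in \mathrm{D}(\mathcal{H})} F\big(\Phi_i(\rho), \Psi_i(\xi)\big). \]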

The second notion of distance that will be used is the trace norm, which can be defined
for any linear operator $X$ by $\| X \|_{\mathrm{tr}} = \operatorname{tr} \sqrt{X^* X}$, or equivalently as the sum of the singular
values of $X$. This quantity is a norm, and so in particular it satisfies the triangle inequality.
Similar to the fidelity, the trace norm is monotone under the partial trace, though in this
case the trace norm is non-increasing under this operation. The proofs that follow will make
essential use of the Fuchs-van de Graaf Inequalities [5] that relate the trace norm and the
fidelity. For any density operators $\rho$ and $\xi$ on the same space, these inequalities are

\[ 1 - F(\rho, \xi) \le \frac{1}{2} \| \rho - \xi \|_{\mathrm{tr}} \le \sqrt{1 - F(\rho, \xi)^2}. \]
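
As a numerical sanity check on these two distance measures, the following sketch evaluates both sides of the Fuchs-van de Graaf inequalities on random density matrices. It assumes the standard closed form $F(X, Y) = \| \sqrt{X} \sqrt{Y} \|_{\mathrm{tr}}$ for the fidelity, which agrees with the purification-based definition above; the helper names (`psd_sqrt`, `trace_norm`, `fidelity`, `random_density_matrix`) are illustrative and not part of the paper.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative rounding errors
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_norm(m):
    """Trace norm: the sum of the singular values of m."""
    return float(np.linalg.svd(m, compute_uv=False).sum())

def fidelity(x, y):
    """Un-squared fidelity, computed as || sqrt(x) sqrt(y) ||_tr."""
    return trace_norm(psd_sqrt(x) @ psd_sqrt(y))

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative sampling, not Haar-uniform)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
rho = random_density_matrix(4, rng)
xi = random_density_matrix(4, rng)

f = fidelity(rho, xi)
half_dist = 0.5 * trace_norm(rho - xi)

# Both Fuchs-van de Graaf inequalities should hold for any pair of states.
print(1 - f <= half_dist <= np.sqrt(1 - f ** 2))
```

The same helpers can be reused to evaluate either measure on the reduced states that appear later in the soundness argument.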

In addition to these measures on quantum states, it will be helpful to have a distance
measure on quantum transformations. One such measure is the diamond norm, which for
a completely positive transformation $\Phi$ on density operators on $\mathcal{H}$ is given by

\[ \| \Phi \|_\diamond = \sup_{\rho \in \mathrm{D}(\mathcal{H} \otimes \mathcal{H})} \| (\Phi \otimes I_{\mathrm{L}(\mathcal{H})})(\rho) \|_{\mathrm{tr}}. \]

This norm is essential when considering transformations as it represents the
distinguishability of two transformations when a reference system is taken into account. The simple
supremum of the trace norm over all inputs to the channel is not stable under the addition
of a reference system, and so the diamond norm is used in place of the simpler one. More
properties and a more thorough definition of this norm can be found in [9].

The circuit model that will be used in this paper is the mixed state model introduced
by Aharonov, Kitaev, and Nisan [1]. Circuits in this model are composed of qubits that are
acted upon by arbitrary trace preserving and completely positive operations. This model
allows for non-unitary operations, such as measurement or the introduction of ancillary
qubits, to occur in the middle of the circuit. It is important to note that this model
captures any physical process that quantum mechanics allows, and so in particular, any
computation that can be done on mixed states with measurements can be represented in
this model. Fortunately this model is polynomially equivalent to the standard model of
unitary quantum circuits (with ancilla) followed by measurement, as shown in [1]. This will
allow us to consider only circuits composed of unitary gates from some finite basis of one
and two qubit gates with the additional operations of introducing qubits in the $|0\rangle$ state
and measuring in the computational basis. This restriction can be strengthened, again with
no loss of generality, to assume that all ancillary qubits are introduced at the start of the
circuit and that all measurements are performed at the end.

We will often add to this circuit model one additional gate: the unbounded fan-out 
gate. This gate, in constant depth, applies a controlled-not operation from one qubit to an 
arbitrary number of output qubits. It is not clear that this gate is a reasonable choice in a 
standard basis of gates, and so it will be clearly marked when this gate is allowed into the 
circuit model under consideration. As an example of the power of this gate it can be used 
to build a constant depth circuit for the approximate quantum Fourier transform [7]. This
gate is considered here for the sole reason that if it is included in the standard set of gates, 
the main result will also hold for constant depth circuits. 
Figure 1: A circuit implementing the swap test.

For spaces $\mathcal{H}$ and $\mathcal{K}$ of the same dimension, we use
$W \in \mathrm{U}(\mathcal{H} \otimes \mathcal{K}, \mathcal{H} \otimes \mathcal{K})$ to represent
the operation that swaps the states in the two spaces. As $W$ is a permutation matrix when
expressed in the computational basis, and the permutation that it encodes is composed
exclusively of transpositions, the swap operation is both hermitian and unitary. Furthermore,
$W$ can easily be implemented in constant depth, as all of the required transpositions can
be performed in parallel. This operator is the essential component of the swap test, where
a controlled $W$ operation is used to determine how close two states are to each other. A
circuit performing the swap test is given in Figure 1, where the measurement is performed in
the computational basis. Another way to view the swap test is as a projective measurement
onto the symmetric and antisymmetric subspaces. The projections in this measurement are
given by $(I + W)/2$ and $(I - W)/2$. This formulation of the swap test is equivalent to the
circuit presented in Figure 1.
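
The projector formulation of the swap test can be checked directly on small examples. The sketch below builds the swap operator for two $d$-dimensional registers and evaluates the probability of the antisymmetric outcome $(I - W)/2$; the function names `swap_operator` and `antisymmetric_probability` are illustrative choices, and the two test states are arbitrary.

```python
import numpy as np

def swap_operator(d):
    """Permutation matrix W on C^d (x) C^d with W|i>|j> = |j>|i>."""
    w = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            w[j * d + i, i * d + j] = 1.0
    return w

def antisymmetric_probability(state):
    """Probability of the (I - W)/2 outcome for a density matrix on C^d (x) C^d."""
    d = int(round(np.sqrt(state.shape[0])))
    projector = (np.eye(d * d) - swap_operator(d)) / 2
    return float(np.real(np.trace(projector @ state)))

d = 2
w = swap_operator(d)
# W is a product of disjoint transpositions, so it is symmetric and its own inverse.
assert np.allclose(w, w.T) and np.allclose(w @ w, np.eye(d * d))

zero = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
same = np.outer(np.kron(zero, zero), np.kron(zero, zero))  # |0>|0>
diff = np.outer(np.kron(zero, plus), np.kron(zero, plus))  # |0>|+>

print(antisymmetric_probability(same))  # identical pure states never fail the test
print(antisymmetric_probability(diff))  # differing states fail with positive probability
```

For a product of pure states $|a\rangle |b\rangle$ the failure probability is $\frac{1}{2} - \frac{1}{2} |\langle a | b \rangle|^2$, so the test detects a difference only probabilistically.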

It is not immediately clear how a controlled operation on n qubits, such as the
controlled-swap operation used in the swap test, can be performed in depth logarithmic in n. The
straightforward implementation requires using one control qubit to control each of the gates 
in the operation. However, Moore and Nilsson [10] give a simple construction that allows 
such an operation to be performed in log-depth. 

Proposition 2.2 (Moore and Nilsson). Any log depth operation on n qubits controlled by
one qubit can be implemented in $O(\log n)$ depth with $O(n)$ ancillary qubits.

Moore and Nilsson prove this only for the constant depth case, but the method of proof 
that they use immediately extends to the log depth case. They prove this proposition by 
using a tree of controlled-not operations of depth $\log n$ to 'duplicate' the control qubit. These copies
can then be used to control the remaining operations, with each control qubit used at most
a logarithmic number of times. This proposition, as an example, implies that the swap test
circuit on n qubits shown in Figure 1 can be implemented in depth $O(\log n)$.
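
The doubling idea behind this construction can be sketched as a schedule of controlled-not layers. The code below is only an illustration of the tree, not Moore and Nilsson's full construction: the function names are hypothetical, qubit 0 plays the role of the control, and the simulation tracks a single computational basis value (on a superposition the same gates produce entangled 'copies' rather than clones).

```python
def fanout_tree_layers(n_copies):
    """Layers of (control, target) CNOT pairs copying qubit 0 onto qubits 1..n_copies.

    Every qubit that already holds a copy acts as a control in the next
    layer, so the number of copies doubles with each layer.
    """
    layers = []
    have = 1                      # qubits 0..have-1 currently hold the control value
    total = n_copies + 1
    while have < total:
        new = min(have, total - have)
        layers.append([(src, have + src) for src in range(new)])
        have += new
    return layers

def apply_layers(bit, n_copies):
    """Classically simulate the tree on a computational basis control bit."""
    qubits = [bit] + [0] * n_copies
    for layer in fanout_tree_layers(n_copies):
        for control, target in layer:
            qubits[target] ^= qubits[control]
    return qubits

layers = fanout_tree_layers(7)
print(len(layers))            # number of CNOT layers needed for 7 ancillary copies
print(apply_layers(1, 7))     # every qubit ends up holding the control value
```

Within a layer no qubit appears twice, so all gates of that layer can be applied in parallel, and the number of layers grows logarithmically in the number of copies.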

If the fan-out gate is allowed into the standard basis of gates, then controlled operations 
can be performed with only constant depth overhead. A circuit that performs this can be 
obtained by simply using one fan-out gate to make n copies (in the computational basis) of 
the control qubit onto ancillary qubits. These 'copies' may then be used to control each of 
the n operations, with a final application of the fan-out gate to restore the ancillary qubits 
to the $|0\rangle$ state. As controlled operations will be the only place that the circuits constructed
here exceed constant depth, this will allow the proof of the main result for constant depth 
circuits with fan-out. 
3. Complete Problems for QIP

The Close Images problem, defined and shown to be complete for QIP in [8], can be
stated as follows.

Close Images. For constants $0 < b < a \le 1$, the input consists of quantum circuits $Q_1$ and
$Q_2$ that implement transformations from $\mathcal{H}$ to $\mathcal{K}$. The promise problem is to distinguish the
two cases:

Yes: $F(Q_1(\rho), Q_2(\xi)) \ge a$ for some $\rho, \xi \in \mathrm{D}(\mathcal{H})$,

No: $F(Q_1(\rho), Q_2(\xi)) \le b$ for all $\rho, \xi \in \mathrm{D}(\mathcal{H})$.

This is simply the problem of determining if there are inputs to $Q_1$ and $Q_2$ that cause them
to output states that are nearly the same. It will be helpful to abbreviate the name of this
problem as $\mathrm{CI}_{a,b}$.

A closely related problem is that of distinguishing two quantum circuits. This problem 
was introduced and shown complete for QIP in [12].

Quantum Circuit Distinguishability. For constants $0 < b < a \le 2$, the input consists
of quantum circuits $Q_1$ and $Q_2$ that implement transformations from $\mathcal{H}$ to $\mathcal{K}$. The promise
problem is to distinguish the two cases:

Yes: $\| Q_1 - Q_2 \|_\diamond \ge a$,

No: $\| Q_1 - Q_2 \|_\diamond \le b$.

Less formally, this problem asks: is there an input density matrix $\rho$ on which the circuits
$Q_1$ and $Q_2$ can be made to act differently? This problem will be referred to as $\mathrm{QCD}_{a,b}$.

It is our aim to prove that these problems remain complete for QIP when restricted
to circuits $Q_1$ and $Q_2$ that are of depth logarithmic in the number of input qubits. This
will be achieved in the case of perfect soundness error, i.e. $a = 1$ and $a = 2$, respectively, in the above problem
definitions. Both of these problems remain complete for QIP in this case. This restriction
serves only to make these problems easier, as distinguishing the two cases for a weaker
promise can only be more difficult, so the results of this paper will also imply the hardness
of the more general problems. The log-depth versions of these problems will be referred to
as Log-depth $\mathrm{CI}_{1,b}$ and Log-depth $\mathrm{QCD}_{2,b}$, and since these are restrictions of QIP-complete
problems it is clear that they are also in QIP. Similarly, the abbreviations Const-depth $\mathrm{CI}_{1,b}$
and Const-depth $\mathrm{QCD}_{2,b}$ for the versions of these problems on constant-depth circuits will
be convenient.

4. Log-Depth Construction 

In this section the reduction from the general $\mathrm{CI}_{1,b}$ problem to the log-depth restriction
of the problem is described. The general idea behind the construction is to simply slice
the circuits of an instance of $\mathrm{CI}_{1,b}$ into logarithmic-depth pieces and run them in parallel.
These circuits will require more input, but if each piece of the circuit is given as input the 
same state output by the previous piece, then the output of the last piece of the circuit will 
be equal to the output of the original circuit. This may not be the case if the intermediate 
inputs are not the outputs of the previous pieces, and so additional tests that ensure these 
inputs are at least close to the desired states are required. 

To describe the reduction, let $Q_1$ and $Q_2$ be the circuits from an instance of $\mathrm{CI}_{1,b}$,
and let n be the size (number of gates) of $Q_1$ and $Q_2$ (by padding the smaller circuit, if
necessary). In order to perform the slicing of the circuit into pieces it is assumed that $Q_1$
and $Q_2$ first introduce any necessary ancillary qubits, then apply local unitary gates, and
finally trace out any qubits that are not part of the output. This restriction can be made
with no loss in generality, as any quantum circuit, even one that incorporates measurements
and other non-unitary operations, can be approximated by such a circuit, and furthermore,
this circuit uses a number of gates that is a polynomial in the size of the original circuit [1].

Figure 2: The original circuits $Q_1$ and $Q_2$ decomposed into constant depth unitary circuits.

A simple way to decompose $Q_1$ into constant depth pieces is to simply let each gate of
$Q_1$ be a piece in the decomposition. Let $U_1, U_2, \ldots, U_n$ be these pieces, with the additional
complication that the operation $U_1$ both adds the ancillary qubits and performs the first
gate of the circuit. In a similar way, $Q_2$ can be decomposed into constant depth pieces
$V_1, V_2, \ldots, V_n$. These pieces are shown in Figure 2. If $Q_1$ and $Q_2$ implement transformations
from $\mathcal{H}$ to $\mathcal{K}$, using ancillary qubits that fit into $\mathcal{A}$, and trace out the qubits in $\mathcal{B}$, then the
spaces $\mathcal{H} \otimes \mathcal{A}$ and $\mathcal{B} \otimes \mathcal{K}$ are isomorphic, since by assumption $Q_1$ and $Q_2$ first introduce
any needed ancilla and only trace qubits out at the end of the computation. Using these
spaces, and implicitly this isomorphism, we have

\[ U_1, V_1 \in \mathrm{U}(\mathcal{H}_1, \mathcal{B}_1 \otimes \mathcal{K}_1), \]

\[ U_i, V_i \in \mathrm{U}(\mathcal{H}_i \otimes \mathcal{A}_i, \mathcal{B}_i \otimes \mathcal{K}_i) \quad \text{for } 2 \le i \le n, \]

where the subscripted spaces are copies of the non-subscripted spaces that hold the input
or output of one of the pieces of the original circuits. As an example of this notation, if
$\rho \in \mathrm{D}(\mathcal{H})$, then the output of $Q_1$ on $\rho$ is given by

\[ \operatorname{tr}_{\mathcal{B}_n} U_n U_{n-1} \cdots U_1 \rho U_1^* U_2^* \cdots U_n^*, \]

and the output of $Q_2$ is given by the same expression using the $V_i$ operators.

Using this decomposition of $Q_1$ and $Q_2$, circuits $C_1$ and $C_2$ are constructed that are
logarithmic in depth and still in some sense faithfully implement $Q_1$ and $Q_2$. This is done
by running the circuits corresponding to $U_1, \ldots, U_n$ in parallel, and tracing out all the
qubits that are not in the output of $U_n$. Such a circuit is constant depth, but does not
necessarily output a state in the image of $Q_1$, as the input to $U_i$ is not necessarily close to
the output from $U_{i-1}$. This problem can be dealt with by comparing the output of $U_{i-1}$
to the input to $U_i$. In order to do this in logarithmic depth an auxiliary input that is first
compared against the input to $U_i$ and then held in reserve to compare to the output of $U_{i-1}$
is needed. To compare these quantum states the swap test can be used. This test will fail
with some probability depending on the distance between the two states. An example of
the construction used to ensure that the output of $U_{i-1}$ agrees with the input to $U_i$ is given
in Figure 3. To simplify the analysis of the constructed circuits these tests are controlled
so that either one or the other is performed. This will affect the failure probability by a
factor of at most two, but will allow the analysis of each swap test to ignore the effect of the
other. To implement this a control qubit is used so that either the first or the second test is
performed between every two pieces $U_i, U_{i+1}$ of the circuit. If a test is not performed, then
the value of the output qubit of the swap test is left unchanged, and so the result of the test
is a qubit in the $|0\rangle$ state. These controlled operations can be implemented in logarithmic
depth using the technique of Moore and Nilsson [10].

Figure 3: Testing that the output of $U_i$ is close to the input of $U_{i+1}$. The inputs $|\psi_j\rangle$ are
the ideal inputs to $U_j$, and are labelled for clarity only; no assumptions are made
about these states. Qubits that do not reach the right edge are traced out.

After adding these tests between each piece of the circuit there is one final modification
required. If any of the swap tests fail, i.e. detect states that are not the same, then they
will output qubits in the $|1\rangle$ state. As yes instances of $\mathrm{CI}_{1,b}$ have outputs that are close
together, we can ensure that no outputs of the constructed circuits can be close if any swap
tests fail by adding dummy qubits in the $|0\rangle$ state to be compared to the outputs of the
swap tests in the other circuit. These dummy qubits are shown in Figure 4.

The constructed circuits $C_1$ and $C_2$ are obtained by decomposing $Q_1$ and $Q_2$ into
constant depth pieces, inserting the swap tests shown in Figure 3, and adding dummy
qubits to ensure that the swap tests in the other circuit do not fail. At the end of these
circuits, all qubits are traced out, except the output (in the space $\mathcal{K}_n$) of $U_n$ or $V_n$, the
output of the swap tests, and the dummy zero qubits. If the outputs of $C_1$ and $C_2$ are close
together, then intuitively the output of the swap tests in each circuit must be close to zero
and the output of $U_n$ and $V_n$ must also be close. If the swap tests do not fail with high
probability (i.e. the outputs are close to zero), then these circuits will more or less faithfully
reproduce the output of $Q_1$ and $Q_2$. Thus, in the case that the outputs of $C_1$ and $C_2$ can
be made close, we will be able to argue that the output of $Q_1$ and $Q_2$ can also be made
close. Proving that this intuitive picture is accurate forms the content of the next section.

Figure 4: The outputs of $C_1$ and $C_2$.

In the other direction, it is not hard to see that if there are states $\rho, \xi \in \mathrm{D}(\mathcal{H})$ such that
$Q_1(\rho) = Q_2(\xi)$, then there are similar states for the constructed circuits $C_1$ and $C_2$. To do
this, notice that the circuit construction does not change if additional qubits are added to
the circuits to allow purifications of the states $\rho$ and $\xi$ to be used as inputs to $C_1$ and $C_2$.
These additional qubits are traced out with the other qubits at the end of the circuit, so
that the output state of the circuit is not changed. As these purifications are pure states
and all operations performed during the circuit are unitary, the intermediate states of the
circuits must also be pure states. If the purified input state is $|\psi\rangle$, then by providing
$|\psi\rangle$ together with copies of the ideal intermediate states $U_i \cdots U_1 |\psi\rangle$
as input to $C_1$, the output of each block of the circuit will be identical to the input to the
next block, ensuring that all the swap tests will succeed with probability one. It remains
only to check on such input states that $C_1$ produces the same output as $Q_1$ on $\rho$. This can
be observed by noting that the output of the circuit is exactly

\[ \operatorname{tr}_{\mathcal{B}_n} U_n U_{n-1} \cdots U_1 \rho U_1^* U_2^* \cdots U_n^*, \]

which by construction is equal to the output of $Q_1$ on $\rho$. Thus if the circuits $Q_1$ and $Q_2$
have intersecting images then so do the circuits $C_1$ and $C_2$. This observation proves the
completeness of the construction. Soundness is considerably more intricate, and is the focus
of the next section.
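
The completeness argument can be illustrated numerically under strong simplifying assumptions: the sketch below ignores ancillary qubits and the final partial trace, uses random unitaries as stand-ins for the pieces $U_1, \ldots, U_n$, and compares each output directly with the next input instead of routing both through an auxiliary copy. All names are illustrative.

```python
import numpy as np

def random_unitary(dim, rng):
    """A random unitary taken from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(g)
    return q

def swap_test_failure(a, b):
    """Antisymmetric-outcome probability for the pure product state |a>|b>."""
    return 0.5 - 0.5 * abs(np.vdot(a, b)) ** 2

rng = np.random.default_rng(1)
dim, n = 4, 6
pieces = [random_unitary(dim, rng) for _ in range(n)]   # stand-ins for U_1, ..., U_n
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

# Honest inputs: the i-th piece receives U_{i-1} ... U_1 |psi>.
inputs = [psi]
for u in pieces[:-1]:
    inputs.append(u @ inputs[-1])

# All pieces act at once, each on its own register.
outputs = [u @ x for u, x in zip(pieces, inputs)]

# Compare the output of piece i with the input of piece i+1.
failures = [swap_test_failure(outputs[i], inputs[i + 1]) for i in range(n - 1)]

# Reference: the original circuit applies the pieces one after another.
sequential = psi
for u in pieces:
    sequential = u @ sequential

print(max(failures))                         # zero up to rounding error
print(np.allclose(outputs[-1], sequential))  # last piece reproduces the sequential output
```

With honest intermediate states every comparison passes and the final register carries the output of the sequential circuit, which is the content of the completeness claim; a dishonest intermediate input would make at least one comparison fail with positive probability.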






5. Soundness of the Construction 

In this section it is demonstrated that if the images of the original circuits $Q_1$ and $Q_2$
are far apart then so must be the images of the constructed circuits $C_1$ and $C_2$. As the
constructed circuits essentially simulate $Q_1$ and $Q_2$ the desired result can be obtained by
arguing that either the outputs of $C_1$ and $C_2$ are far apart or the input to at least one of the
constructed circuits is not a faithful simulation of the corresponding original circuit. In the
case that this simulation is not faithful it will be shown that there is some swap test that
fails with reasonable probability. This implies that outputs of the constructed circuits must
also be distant, as the failing swap test produces a state of the form $(1 - p)|0\rangle\langle 0| + p|1\rangle\langle 1|$
that has low fidelity with the corresponding dummy zero qubit of the other circuit.

As a first step, we place a lower bound on the failure probability of a swap test in terms 
of the fidelity of the two states being compared. In the following lemma the swap test is 
viewed as a measurement of the symmetric and antisymmetric projectors, with the outcome 
that produces a qubit in the state $|1\rangle$ corresponding to the antisymmetric case.

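Lemma 5.1. If $\rho \in \mathrm{D}(\mathcal{A} \otimes \mathcal{B})$ then a swap test on $\mathcal{A} \otimes \mathcal{B}$ returns the antisymmetric
outcome with probability at least

\[ \frac{1}{2} - \frac{1}{2} F(\operatorname{tr}_{\mathcal{A}} \rho, \operatorname{tr}_{\mathcal{B}} \rho). \]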

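Proof. Let $|\psi\rangle \in \mathcal{A} \otimes \mathcal{B} \otimes \mathcal{C}$ be a purification of $\rho$, where $\mathcal{C}$ is an arbitrary space of sufficient
dimension to allow such a purification. The swap test measures the state on $\mathcal{A} \otimes \mathcal{B}$ with
the projectors $\frac{1}{2}(I - W)$ and $\frac{1}{2}(I + W)$, where $W$ is the swap operator on $\mathcal{A} \otimes \mathcal{B}$. Thus, the
antisymmetric outcome occurs with probability

\[ \frac{1}{4} \operatorname{tr}\big( [(I - W) \otimes I] |\psi\rangle\langle\psi| [(I - W^*) \otimes I] \big) = \frac{1}{2} \langle \psi | (I \otimes I - W \otimes I) | \psi \rangle = \frac{1}{2} - \frac{1}{2} \langle \psi | W \otimes I | \psi \rangle, \]

as $W$ is hermitian. Then as $W$ is also unitary, the states $|\psi\rangle$ and $(W \otimes I)|\psi\rangle$ each purify both
$\operatorname{tr}_{\mathcal{A} \otimes \mathcal{C}} |\psi\rangle\langle\psi|$ and $\operatorname{tr}_{\mathcal{B} \otimes \mathcal{C}} |\psi\rangle\langle\psi|$, and so by Uhlmann's theorem

\[ \frac{1}{2} - \frac{1}{2} \langle \psi | W \otimes I | \psi \rangle \ge \frac{1}{2} - \frac{1}{2} F\big( \operatorname{tr}_{\mathcal{A} \otimes \mathcal{C}} |\psi\rangle\langle\psi|, \operatorname{tr}_{\mathcal{B} \otimes \mathcal{C}} |\psi\rangle\langle\psi| \big). \]

After tracing out the space $\mathcal{C}$, this is exactly the statement of the lemma. ■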
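
The bound of Lemma 5.1 can be tested numerically on a random, generally entangled, mixed state. This is a sketch only: it assumes the closed form $F(X, Y) = \| \sqrt{X} \sqrt{Y} \|_{\mathrm{tr}}$ for the fidelity, uses the identity that the antisymmetric outcome has probability $\frac{1}{2} - \frac{1}{2} \operatorname{tr}(W \rho)$, and all function names are illustrative.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def fidelity(x, y):
    """Un-squared fidelity, computed as the trace norm of sqrt(x) sqrt(y)."""
    return float(np.linalg.svd(psd_sqrt(x) @ psd_sqrt(y), compute_uv=False).sum())

def swap_operator(d):
    """Permutation matrix W on C^d (x) C^d with W|i>|j> = |j>|i>."""
    w = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            w[j * d + i, i * d + j] = 1.0
    return w

def reduced_states(rho, d):
    """Return (state on A, state on B) for rho on A (x) B with dim A = dim B = d."""
    t = rho.reshape(d, d, d, d)           # indices: a, b, a', b'
    rho_a = np.einsum('abcb->ac', t)      # trace out B
    rho_b = np.einsum('abac->bc', t)      # trace out A
    return rho_a, rho_b

rng = np.random.default_rng(2)
d = 3
g = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))
rho = g @ g.conj().T
rho /= np.trace(rho).real                 # random mixed state on A (x) B

p_fail = 0.5 - 0.5 * np.trace(swap_operator(d) @ rho).real
rho_a, rho_b = reduced_states(rho, d)
bound = 0.5 - 0.5 * fidelity(rho_a, rho_b)

print(p_fail >= bound)                    # the lemma's lower bound on the failure probability
```

Because the state is not assumed to be a product state, this check exercises the same generality that the proof obtains through purification.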

This lemma cannot be immediately applied to the circuits $C_1$ and $C_2$, as in these circuits
the output of one block of the circuit is not directly compared to the input to the next block,
but instead each of these states is with probability 1/2 compared to some intermediate
value. In order to deal with this difficulty, we use the Fuchs-van de Graaf inequalities to
translate the fidelity to a relation involving the trace norm, to which we can then apply the
triangle inequality. This application of the triangle inequality shows that at least one
of the two swap tests fails with probability bounded below by an expression involving the
fidelity. In the following corollary the reduced states of various parts of the input to either
of the circuits $C_1$ or $C_2$ are used, but it is not assumed that these states are given in a
separable form. For instance, the density matrices $\rho_i$, $\sigma_i$, and $\xi_i$ that appear in the corollary and its proof
may be part of some larger entangled pure state, so that the failure probabilities of the two
swap tests need not be independent.

Corollary 5.2. If $\rho$ is input to the circuit $C_a$ for $a \in \{1, 2\}$, with $\rho_i$ the reduced state
of $\rho$ on $\mathcal{H}_i \otimes \mathcal{A}_i$, then at least one of the swap tests on the $i$th block of $C_a$ fails with
probability at least

\[ \frac{1}{64} \| U_i \rho_{i-1} U_i^* - \rho_i \|_{\mathrm{tr}}^2. \]

Proof. In the $i$th block of $C_a$ there are two inputs to the first swap test: let the reduced
density operators of these inputs be $\rho_i$ and $\sigma_i$. The inputs to the second swap test are then
given by $\sigma_i$ and $U_i \rho_{i-1} U_i^* = \xi_i$. As exactly one of these tests is performed we do not need
to consider the effect of the first test on the state when considering the second test, and so
the same input state $\sigma_i$ is used in both swap tests.

By Lemma 5.1 the failure probabilities of the first and second tests, when performed, are
at least $\frac{1}{2}(1 - F(\rho_i, \sigma_i))$ and $\frac{1}{2}(1 - F(\sigma_i, \xi_i))$, respectively. Thus, the probability $p$ that at
least one of these tests fails, given that each of them is performed with probability 1/2, is
at least

\[ p \ge \frac{1}{2} \max \Big\{ \frac{1}{2} (1 - F(\sigma_i, \xi_i)), \frac{1}{2} (1 - F(\rho_i, \sigma_i)) \Big\} = \frac{1}{4} \big( 1 - \min \{ F(\sigma_i, \xi_i), F(\rho_i, \sigma_i) \} \big). \]

By the Fuchs-van de Graaf inequalities, this fidelity may be replaced by the trace norm.
Doing so, we obtain

\[ p \ge \frac{1}{16} \max \big( \| \sigma_i - \xi_i \|_{\mathrm{tr}}^2, \| \rho_i - \sigma_i \|_{\mathrm{tr}}^2 \big). \]

Finally, as this maximum must be at least the average of the two values,

\[ p \ge \frac{1}{16} \Big( \frac{\| \sigma_i - \xi_i \|_{\mathrm{tr}}}{2} + \frac{\| \rho_i - \sigma_i \|_{\mathrm{tr}}}{2} \Big)^2 \ge \frac{1}{64} \| \xi_i - \rho_i \|_{\mathrm{tr}}^2, \]

where the last inequality follows from an application of the triangle inequality. ■
By repeatedly applying some of the properties of the trace norm discussed in Section 2
it is somewhat tedious but not difficult to reduce the problem at hand to the previous
Corollary. This is the content of the following theorem.

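Theorem 5.3. If $F(Q_1(\rho_0), Q_2(\xi_0)) < 1 - c$ for all $\rho_0, \xi_0 \in \mathrm{D}(\mathcal{H})$ then

\[ F(C_1(\rho), C_2(\xi)) < 1 - \frac{c^2}{144 n^2} \]

for all $\rho, \xi \in \mathrm{D}((\mathcal{H} \otimes \mathcal{A})^{\otimes 2n})$.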

Proof. Let $\rho$ and $\xi$ be inputs to $C_1$ and $C_2$, and let $\rho_i, \xi_i$ be the reduced states of these
inputs on $\mathcal{H}_i \otimes \mathcal{A}_i$ for $0 \le i \le 2n$, where the states for $i > n$ are the inputs that are only
used by the swap tests, which we will not need to refer to explicitly. That is, $\rho_i$ and $\xi_i$ for
$0 \le i \le n$ are the portions of the state that are input to the unitaries $U_i$ and $V_i$ that make
up the circuits $Q_1$ and $Q_2$. The output of the circuits $C_1$ and $C_2$ is then given by a number
of qubits corresponding to the swap tests as well as the states $\operatorname{tr}_{\mathcal{B}_n} \rho_n$ and $\operatorname{tr}_{\mathcal{B}_n} \xi_n$, where $\mathcal{B}_n$
is simply the space that is traced out to obtain the output from the unitary representations
of the original circuits.

By the condition on the fidelity of $Q_1$ and $Q_2$ and the Fuchs-van de Graaf inequalities,
we have $2c < \| Q_1(\rho_0) - Q_2(\xi_0) \|_{\mathrm{tr}}$. Using the triangle inequality we can relate this to the
distance between the constructed circuits. Adding terms and simplifying, we obtain

\[ 2c < \| Q_1(\rho_0) - \operatorname{tr}_{\mathcal{B}_n} \rho_n + \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_0) + \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}} \]

\[ \le \| Q_1(\rho_0) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_0) \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}}. \]

We now observe that $\| \operatorname{tr}_{\mathcal{B}_n} \rho_n - \operatorname{tr}_{\mathcal{B}_n} \xi_n \|_{\mathrm{tr}} \le \| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}}$ by the monotonicity of the
trace norm under the partial trace, since the former can be obtained from the latter by
tracing out the appropriate spaces. Using this we have

\[ 2c < \| Q_1(\rho_0) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} + \| \operatorname{tr}_{\mathcal{B}_n} \xi_n - Q_2(\xi_0) \|_{\mathrm{tr}} + \| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}}. \qquad (5.1) \]

As the three terms on the right are nonnegative, at least one of them must be larger than
the average $2c/3$. If $\| C_1(\rho) - C_2(\xi) \|_{\mathrm{tr}} > 2c/3$ then $F(C_1(\rho), C_2(\xi)) < 1 - c^2/144$ and there
is nothing left to prove.

The cases where one of the first two terms of (5.1) exceeds $2c/3$ are symmetric, and so
we can consider only the first term. Expanding $Q_1(\rho_0)$ in terms of the $U_i$, we obtain

\[ \frac{2c}{3} < \| Q_1(\rho_0) - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} \]

\[ = \| \operatorname{tr}_{\mathcal{B}_n} U_n U_{n-1} \cdots U_1 \rho_0 U_1^* U_2^* \cdots U_n^* - \operatorname{tr}_{\mathcal{B}_n} \rho_n \|_{\mathrm{tr}} \]

\[ \le \| U_n U_{n-1} \cdots U_1 \rho_0 U_1^* U_2^* \cdots U_n^* - \rho_n \|_{\mathrm{tr}}, \]

where once again the monotonicity of the trace norm under the partial trace has been used.
By repeating the strategy of adding terms and then applying the triangle inequality we have

\[ \frac{2c}{3} < \| U_1 \rho_0 U_1^* - \rho_1 \|_{\mathrm{tr}} + \| U_n U_{n-1} \cdots U_2 \rho_1 U_2^* \cdots U_n^* - \rho_n \|_{\mathrm{tr}}. \]

Here we have made use of the unitary invariance of the trace norm to discard the operators
$U_2, \ldots, U_n$ from the first term. Continuing in this fashion we have

\[ \frac{2c}{3} < \sum_{i=1}^{n} \| U_i \rho_{i-1} U_i^* - \rho_i \|_{\mathrm{tr}}. \]

As all terms in this sum are nonnegative, there must be at least one term in the sum that
exceeds $2c/(3n)$, as this is a lower bound on the average of all terms. Thus, for some value
of $i$, we have $\| U_i \rho_{i-1} U_i^* - \rho_i \|_{\mathrm{tr}} > 2c/(3n)$, and so by Corollary 5.2 one of the corresponding
swap tests fails with probability $p > c^2/(144 n^2)$. The qubit representing the output value
of this swap test is then of the form $(1 - p)|0\rangle\langle 0| + p|1\rangle\langle 1|$, and so, by the monotonicity of
the fidelity under the partial trace,

\[ F(C_1(\rho), C_2(\xi)) \le F\big( (1 - p)|0\rangle\langle 0| + p|1\rangle\langle 1|, |0\rangle\langle 0| \big) = \sqrt{1 - p} < 1, \]

with a gap from 1 that is inverse polynomial in $n$, as in the statement of the theorem. ■



By combining Theorem 5.3 with the observation in Section 4 and the multiplicativity
of the maximum output fidelity of two transformations, we obtain the following result.

Corollary 5.4. The problem Log-depth $\mathrm{CI}_{1,b}$ is QIP-complete for any constant $0 < b < 1$.

Proof. Theorem 5.3 establishes the completeness of the problem for any $b > 1 - c^2/(144 n^2)$,
where n is an upper bound on the size of the circuits. Using Theorem 2.1 of Kitaev,
Shen, and Vyalyi [9] we can repeat each of the circuits $r$ times in parallel to obtain the
completeness of the problem for $b > (1 - c^2/(144 n^2))^r$, which can be made smaller than
any constant for $r$ some polynomial in n. ■
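
To see that a polynomial number of parallel repetitions suffices, the following sketch computes a repetition count $r$ with $(1 - c^2/(144 n^2))^r \le b$. The function name and the sample parameters are illustrative only; the bound $r \le \lceil 144 n^2 \ln(1/b) / c^2 \rceil$ follows from $-\ln(1 - \delta) \ge \delta$.

```python
import math

def repetitions_needed(c, n, b):
    """A repetition count r with (1 - c^2 / (144 n^2))^r <= b, for 0 < b < 1 and 0 < c <= 1."""
    delta = c * c / (144.0 * n * n)
    # Both logarithms are negative, so the ratio is positive.
    return math.ceil(math.log(b) / math.log1p(-delta))

# Example parameters (arbitrary): gap c = 0.5, circuit size n = 100, target b = 0.1.
r = repetitions_needed(c=0.5, n=100, b=0.1)
print(r)
print(r <= math.ceil(144 * 100 ** 2 * math.log(10) / 0.25))  # within the O(n^2) bound
```

Since the copies run side by side, this amplification increases the width of the circuits by a factor of $r$ but leaves their depth unchanged.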

As the circuits constructed by the reduction only make use of logarithmic depth when 
performing swap tests, and the controlled swap operations performed by these tests can 
be accomplished in constant depth using unbounded fan-out gates, the following Corollary 
follows immediately from the previous one. 

Corollary 5.5. The problem Const-depth $\mathrm{CI}_{1,b}$ on circuits with the unbounded fan-out gate
is QIP-complete for any constant $0 < b < 1$.

6. Distinguishing Log-Depth Computations 

The hardness of Log-depth $\mathrm{CI}_{1,b}$ can be extended to Log-depth $\mathrm{QCD}_{2,b}$ by observing
that the reduction for the polynomial depth version of this problem in [12] can be made to
preserve the depth of the constructed circuits. Once this observation is made, the hardness
of the log-depth (and constant-depth with fan-out) versions of the circuit distinguishability
problem is immediate.

The reduction in [12] takes as input circuits $(Q_1, Q_2)$ and produces circuits $C_1$ and
$C_2$. Without describing the reduction in detail, the constructed circuits $C_1$ and $C_2$ run,
depending on the value of a control qubit, one of $Q_1$ and $Q_2$, followed by a constant depth
circuit. If the input circuits $Q_1$ and $Q_2$ have logarithmic depth, then the only significant
difficulty is the fact that controlled versions of these circuits are needed. However, as we
have already seen, if we replace the gates in $Q_1$ and $Q_2$ with controlled versions, then we
can use the scheme of Moore and Nilsson [10] to implement the controlled operations in
logarithmic depth. With this modification, the reduction in [12] can be reused to show the
hardness of the QCD problem on log-depth circuits.

Corollary 6.1. Log-depth $\mathrm{QCD}_{2,b}$ is QIP-complete for any constant $0 < b < 2$.

Once again these controlled operations can be implemented in a constant depth circuit
if the unbounded fan-out gate is allowed into the set of allowed gates.



Corollary 6.2. Const-depth $\mathrm{QCD}_{2,b}$ on circuits with the unbounded fan-out gate is
QIP-complete for any constant $0 < b < 2$.

7. Conclusion 

The hardness of distinguishing even log-depth mixed state quantum circuits leaves 
several related open problems, a few of which are listed here. 

• Can this new complete problem be used to further understand QIP? 

• Does this result rely in an essential way on the mixed state circuit model? How 
difficult is it to distinguish quantum circuits in less general models of computation? 

• What is the complexity of distinguishing constant depth quantum circuits that do 
not use the unbounded fan-out gate? 